LAB - CUDA VISION WS 21/22

Done by -

Table of Contents

Importing Modules

Datasets

The general structure of our data folder is described below. The sequences.txt file has been modified and uploaded to GitHub in the datasets folder: the original listed a corrupted video file, person01_boxing_d4_uncomp.avi, which we removed from sequences.txt for simplicity. Everything else remains the same.
To obtain the KTH action dataset, download it from https://www.csc.kth.se/cvap/actions/ or run the script download_kth.sh inside the KTH folder.
On the first run, set the download flag to True in the load_dataset() function in utils.py for the KTH dataset. Once the .pt files have been generated, set the download flag back to False for subsequent runs.
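The first-run/subsequent-run toggle described above can be sketched as follows. This is a hypothetical helper for illustration only; the real logic lives in the load_dataset() function in utils.py.

```python
from pathlib import Path

def load_kth(kth_dir: str, download: bool = False):
    """Sketch: reuse cached .pt files if present, else (re)build them."""
    cache = Path(kth_dir) / "data"
    pt_files = sorted(cache.glob("*.pt"))
    if pt_files and not download:
        return pt_files  # cached preprocessed tensors from an earlier run
    # First run (download=True): fetch the videos and write the .pt files here.
    return pt_files
```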

 |── Cuda Lab Project
   |── scripts
   |── data
    |── KTH
     |── boxing
     |── handclapping
     |── handwaving
     |── jogging
     |── running
     |── walking
     |── sequences.txt
     |── data
      |── All .pt files, processed video files
    |── MNIST
     |── raw
     |── processed
      |── moving_test.pt
      |── moving_train.pt

Path to the dataset folder, which contains two separate folders for the Moving-MNIST and KTH datasets.


Loading and visualising a part of the training data for Moving-MNIST dataset.
Loading and visualising a part of the training data for KTH action dataset.

Dataloaders

Creating dataloaders for Moving-MNIST dataset with a batch size of 32.
Creating dataloaders for KTH action dataset with a batch size of 32.
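As a sketch, a dataloader with batch size 32 can be created like this, using random tensors shaped like Moving-MNIST clips in place of the real .pt data (the actual tensors come from the preprocessed files above):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data: 100 clips of 20 frames at 1x64x64, mimicking Moving-MNIST.
clips = TensorDataset(torch.randn(100, 20, 1, 64, 64))
loader = DataLoader(clips, batch_size=32, shuffle=True)

(batch,) = next(iter(loader))
print(tuple(batch.shape))  # (32, 20, 1, 64, 64)
```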

Experiments and Results.

Moving-MNIST experiment models trained for 50 epochs.

Instructions to run training

To train and evaluate the model use the commands listed below:

python scripts/main.py -c dataset_config.yaml --lr_warmup True --add_ssim True --criterion loss_function -s scheduler

-c corresponds to the config file; the two config files, kt.yaml and mnist.yaml, are present in the configs folder.

--lr_warmup - set to True to apply LR warmup to the scheduler in use; otherwise set to False.

--add_ssim - set to True to combine SSIM with MSE or MAE as the training loss; otherwise set to False.

--criterion - the loss criterion used for training; its values are 'mae' or 'mse'.

-s - the type of scheduler used; its values are 'exponential' or 'plateau', corresponding to the two schedulers ExponentialLR and ReduceLROnPlateau.
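A minimal argparse sketch of how these flags could be wired together. This is hypothetical; the actual parser lives in scripts/main.py and may differ.

```python
import argparse

def str2bool(s: str) -> bool:
    # The flags above take literal True/False on the command line
    return s.lower() == "true"

def build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(description="Train the frame-prediction model")
    p.add_argument("-c", "--config", required=True,
                   help="config file, e.g. mnist.yaml from the configs folder")
    p.add_argument("--lr_warmup", type=str2bool, default=False)
    p.add_argument("--add_ssim", type=str2bool, default=False)
    p.add_argument("--criterion", choices=["mae", "mse"], default="mse")
    p.add_argument("-s", "--scheduler", choices=["exponential", "plateau"],
                   default="plateau")
    return p

args = build_parser().parse_args(
    ["-c", "mnist.yaml", "--lr_warmup", "True",
     "--add_ssim", "True", "--criterion", "mse", "-s", "plateau"])
print(args.criterion, args.scheduler)  # mse plateau
```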

This trains the frame-prediction model and saves it after every 5th epoch in the model directory.

It also generates folders in the results directory every log-frequency steps. The folders contain the ground-truth and predicted frames for the test dataset. These outputs, along with the loss, are also logged to Weights and Biases.

Evaluation:

Once training is complete and the models are saved, evaluate_model.py can be used to compute the following metrics for the model: MSE, MAE, PSNR, SSIM and LPIPS.

This evaluation can be run using the following command:

python scripts/evaluate_model.py -d moving_mnist -mp model_path -s tensor_saving_path

-d corresponds to the dataloader used; its values are 'moving_mnist' and 'kth' for the Moving-MNIST and KTH Action datasets.

-mp corresponds to the path, including the model name and extension (example: models/mnist/model_50.pth), where the model is stored.

-s corresponds to the path where the tensors for the metrics are stored (example: results_eval/mnist)
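Of the metrics listed above, PSNR follows directly from MSE; for pixel values in [0, 1] it can be computed as:

```python
import math

def psnr(mse: float, max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio (dB) from a mean-squared error."""
    return 10.0 * math.log10(max_val ** 2 / mse)

print(round(psnr(1e-3), 1))  # 30.0
```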

Experiments Conducted

All experiments are run for 50 epochs, and the best model is additionally trained for 100 epochs. The experiments performed with the Moving-MNIST dataset and their corresponding Weights and Biases links are listed below:

  1. Moving-MNIST trained with MSE loss and ReduceLROnPlateau scheduler : Wandb Link, Model Link.

  2. Moving-MNIST trained with MSE + SSIM loss, ReduceLROnPlateau scheduler and LR warmup : Wandb Link, Model Link.

  3. Moving-MNIST trained with MSE + SSIM loss, Exponential LR scheduler and LR warmup : Wandb Link, Model Link.

  4. Moving-MNIST trained with MAE + SSIM loss, Exponential LR scheduler and LR warmup : Wandb Link, Model Link.

  5. Moving-MNIST trained with MAE loss and ReduceLROnPlateau scheduler : Wandb Link, Model Link.

  6. Moving-MNIST trained with MSE + SSIM loss, ReduceLROnPlateau scheduler and LR warmup (trained without context-frame addition) : Wandb Link, Model Link.

  7. Moving-MNIST trained with MSE + SSIM loss, ReduceLROnPlateau scheduler and LR warmup (with skip connections in the encoder/decoder VGG blocks) : Wandb Link, Model Link.

  8. Moving-MNIST trained with MAE + SSIM loss, ReduceLROnPlateau scheduler and LR warmup : Wandb Link, Model Link.

  9. Moving-MNIST trained with MSE + SSIM loss, ReduceLROnPlateau scheduler and LR warmup, trained for 100 epochs : Wandb Link, Model Link.

Below is the model trained with MSE loss and the ReduceLROnPlateau scheduler without learning-rate warmup.

Wandb Link

Visualising results on the test dataset. The first row shows the target sequence and the second the frames our model predicted.

Moving-MNIST trained with MSE + SSIM loss, ReduceLROnPlateau scheduler and LR warmup

Wandb Link

Moving-MNIST trained with MSE + SSIM loss, Exponential LR scheduler and LR warmup

Wandb Link

Moving-MNIST trained with MAE and ReduceLROnPlateau scheduler

Wandb Link

Moving-MNIST trained with MAE + SSIM loss, Exponential LR scheduler and LR warmup

Wandb Link

Moving-MNIST trained with MAE + SSIM loss, ReduceLROnPlateau scheduler and LR warmup

Wandb Link

KTH Action Experiment models

The various experiments performed with the KTH Action Dataset and their corresponding Weights and Biases links are provided below:

  1. KTH Action Dataset trained with MSE loss and ReduceLROnPlateau scheduler : Wandb Link, Model Link.

  2. KTH Action Dataset trained with MSE + SSIM loss, ReduceLROnPlateau scheduler and LR warmup : Wandb Link, Model Link.

  3. KTH Action Dataset trained with MSE + SSIM loss, Exponential LR scheduler and LR warmup : Wandb Link, Model Link.

  4. KTH Action Dataset trained with MAE + SSIM loss, Exponential LR scheduler and LR warmup : Wandb Link, Model Link.

  5. KTH Action Dataset trained with MAE loss and ReduceLROnPlateau scheduler : Wandb Link, Model Link.

  6. KTH Action Dataset trained with MAE + SSIM loss, ReduceLROnPlateau scheduler and LR warmup : Wandb Link, Model Link.

Below is a model trained on the KTH Action dataset using MSE as the loss criterion with the ReduceLROnPlateau scheduler, without learning-rate warmup.

Wandb Link

Visualising results on the test dataset. The first row shows the target sequence and the second the frames our model predicted.

KTH trained with MSE + SSIM loss and ReduceLROnPlateau scheduler with LR warmup:

Wandb Link

KTH trained with MSE + SSIM loss and ExponentialLR scheduler with LR warmup:

Wandb Link

KTH trained with MAE and ReduceLROnPlateau scheduler without LR warmup:

Wandb Link

KTH trained with MAE + SSIM loss and ReduceLROnPlateau scheduler with LR warmup:

Wandb Link

KTH trained with MAE + SSIM loss and ExponentialLR scheduler with LR warmup:

Wandb Link

Wandb Link

Best Performing models

The best-performing model for Moving-MNIST was the one trained on the combined loss (MSE + $\lambda$ * SSIM). More details are given in the report. Below are the results for it on a test sample.

Wandb Link, Model Link.

The best-performing model for the KTH action dataset was the one trained on the combined perceptual loss (MSE + $\lambda$ * SSIM) for 50 epochs. More details are given in the report. Below are the results from it on a test sample.

Wandb Link, Model Link.

Statistics

table_1.JPG

table 2.JPG

Moving-MNIST: The best results are obtained when the model is trained with a combined loss (MSE with SSIM) together with the ReduceLROnPlateau scheduler and LR warmup. The scheduler decreases the learning rate by a factor of 0.1 if the loss plateaus for 10 epochs (the patience level), which helps improve results once the model reaches a plateau.
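The plateau behaviour described above can be sketched in plain Python. This is a simplified stand-in for PyTorch's ReduceLROnPlateau with factor=0.1 and patience=10, not the scheduler's exact implementation:

```python
def plateau_lr(losses, lr=1e-3, factor=0.1, patience=10):
    """Cut the learning rate when the loss stops improving (simplified)."""
    best, stale = float("inf"), 0
    for loss in losses:
        if loss < best:           # improvement: reset the patience counter
            best, stale = loss, 0
        else:
            stale += 1
            if stale > patience:  # stalled for > patience epochs: decay LR
                lr *= factor
                stale = 0
    return lr

# 12 epochs with no improvement after the first -> one decay step
print(round(plateau_lr([1.0] * 12), 6))  # 0.0001
```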

Our best model achieves a lower SSIM score when trained with the MSE + SSIM loss. Frames predicted with MAE as the loss criterion sometimes disregard the second digit, which leads to perfectly black backgrounds, as opposed to those generated with MSE loss, which contain grayish noise. We suspect this is why the SSIM metric performs well with MAE loss on the Moving-MNIST dataset.

KTH dataset: The best-performing model is trained with MSE and SSIM as the loss function along with the ReduceLROnPlateau scheduler. This model achieves the highest SSIM measure of 0.77 and a high PSNR value, although not the highest. Its LPIPS value (0.239) is also the second best. Qualitatively, this model gives us the best results.

References